Model based analysis of a diphone database for improved unit concatenation

Authors

  • Karl Schnell
  • Arild Lacroix
Abstract

A crucial issue in diphone concatenation is handling the discontinuities between the concatenated units. This paper addresses the problem through a suitable analysis of the diphones for parametric synthesis. The synthesis model is the lossy tube model, an extension of the standard lattice filter that accounts for frequency-dependent vocal tract losses. The tube model parameters are estimated from the diphones by an optimization algorithm. Discontinuities of the model parameters at the diphone joints degrade the quality of the synthesis results. To reduce the mismatch between parameter configurations at the diphone boundaries, a specific analysis of the diphone database is proposed in which each diphone is analyzed with respect to other diphones containing the same phonemes. This reduces the parameter mismatches at the diphone joints and improves the concatenation results considerably.
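To make the ideas of a lattice filter and a parameter mismatch at a joint concrete, the following is a minimal sketch, not the authors' method. It implements only the standard lossless all-pole lattice filter (the frequency-dependent losses of the lossy tube model are not modeled), and it uses the Euclidean distance between reflection-coefficient vectors as an assumed, illustrative mismatch measure. The function names `lattice_synthesize` and `joint_mismatch` are hypothetical.

```python
import math


def lattice_synthesize(excitation, k):
    """Standard all-pole lattice synthesis filter (lossless case only).

    excitation: list of input samples.
    k: reflection coefficients k_1..k_p, each with |k_i| < 1.
    The lossy tube model of the paper extends this structure; the
    extension is NOT implemented here.
    """
    p = len(k)
    b_prev = [0.0] * (p + 1)  # backward signals b_i[n-1], i = 0..p
    out = []
    for e in excitation:
        f = e  # forward signal f_p[n]
        b_new = [0.0] * (p + 1)
        for i in range(p, 0, -1):
            # f_{i-1}[n] = f_i[n] - k_i * b_{i-1}[n-1]
            f = f - k[i - 1] * b_prev[i - 1]
            # b_i[n] = k_i * f_{i-1}[n] + b_{i-1}[n-1]
            b_new[i] = k[i - 1] * f + b_prev[i - 1]
        b_new[0] = f  # b_0[n] = f_0[n]
        out.append(f)
        b_prev = b_new
    return out


def joint_mismatch(k_left_end, k_right_start):
    """Assumed mismatch measure: Euclidean distance between the
    reflection coefficients of the last frame of the left diphone and
    the first frame of the right diphone."""
    if len(k_left_end) != len(k_right_start):
        raise ValueError("coefficient vectors must have equal length")
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(k_left_end, k_right_start))
    )


if __name__ == "__main__":
    # Impulse response of a first-order lattice with k_1 = 0.5.
    print(lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.5]))
    # Mismatch between two hypothetical boundary frames.
    print(joint_mismatch([0.3, 0.0], [0.0, 0.4]))
```

A smaller `joint_mismatch` value at a boundary means the two parameter configurations are closer there, which is the kind of mismatch the proposed database analysis aims to reduce. Other distance measures, for example on log area ratios, would serve equally well for illustration.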

Similar resources

Combination of LSF and pole based parameter interpolation for model-based diphone concatenation

For speech generation using small databases, spectral smoothing at the unit joints is necessary and can be realized by an interpolation of model parameters. For that purpose, the LSF are the best choice from the conventional parameter descriptions. This contribution shows how LSF interpolations can be improved using poles as parameters. The problem of the pole assignment between the two pole co...

Diphone synthesis using unit selection

This paper describes an experimental AT&T concatenative synthesis system using unit selection, for which the basic synthesis units are diphones. The synthesizer may use any of the data from a large database of utterances. Since there are in general multiple instances of each concatenative unit, the system performs dynamic unit selection. Selection among candidates is done dynamically at synthes...

Speech Data Analysis for Diphone Construction of a Maori Online Text-to-speech Synthesizer

One of the main types of speech processing technologies today is text-to-speech (TTS) synthesis. A well-established speech synthesizer technique called ‘diphone concatenation’ uses a speaker's processed speech examples to apply a more human-like response to the TTS synthesis system. This methodology has been used to construct many diphone databases for various languages, and was the basis for bu...

Unit selection for speech synthesi target cos

This paper presents a new approach to unit selection for corpus-based speech synthesis, in which the units are selected according to acoustic criteria. In a learning stage, an acoustic clustering is carried out using context dependent HMM. During synthesis, an acoustic target is generated and segmented in the required diphone sequence. For each diphone to be synthesized, a pre-selection module ...

A biphone constrained concatenation method for diphone synthesis

Diphone concatenation [1] has the advantages of simplicity and a relatively small database of speech when compared to other concatenative synthesis methods (e.g., [2]). However, diphone concatenation faces two notable problems. The first is coarticulation which extends beyond the scope of a single diphone and entails some degree of contextual mismatch for virtually any diphone in at least some ...

Publication date: 2005